Bias Variance Trade-Off: Overfitting and Underfitting
Bias is the inability of a machine learning model to truly capture the relationship in the training data. In the models shown above, M1 is unable to describe the relationship in the training data, M3 describes it almost perfectly, and M2 sits in between. Variance is nothing but the difference between the error on the training data and the error on the testing data. In the same models, M1 and M2 have low variance because of the small gap between their training and testing errors, while M3 has high variance. A concrete sketch of this pattern is shown below.
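The following is a minimal sketch, not taken from the article, that illustrates the pattern above. It assumes the three models M1, M2, and M3 correspond to polynomial regressions of increasing degree fit to the same noisy data; the degrees chosen here are purely illustrative.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Noisy samples of a smooth underlying function.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * x).ravel() + rng.normal(0, 0.3, size=60)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

# Hypothetical stand-ins for M1, M2, M3: too simple, balanced, too flexible.
for name, degree in [("M1 (underfit)", 1), ("M2 (balanced)", 4), ("M3 (overfit)", 15)]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(x_train))
    test_err = mean_squared_error(y_test, model.predict(x_test))
    # High bias shows up as a large training error (M1); high variance shows up
    # as a large gap between training and testing error (M3).
    print(f"{name}: train MSE={train_err:.3f}, test MSE={test_err:.3f}")
```

Running this typically shows M1 with a large training error (high bias), M3 with a tiny training error but a much larger testing error (high variance), and M2 with similar, moderate errors on both sets.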
Making Sense of the Bias / Variance Trade-off in (Deep) Reinforcement Learning
Since the launch of the ML-Agents platform a few months ago, I have been surprised and delighted to find that, thanks to it and other tools like OpenAI Gym, a new, wider audience of individuals is building Reinforcement Learning (RL) environments and using them to train state-of-the-art models. The ability to work with these algorithms, previously something reserved for ML PhDs, is opening up to a wider world. As a result, I have had the unique opportunity not just to write about applying RL to existing problems, but also to help developers and researchers debug their models in a more active way. In doing so, I often get questions that come down to understanding the unique hyperparameters and learning process of the RL paradigm. In this article, I want to highlight one of these conceptual pieces, bias and variance in RL, and attempt to demystify it to some extent.